80 research outputs found

An Exchange Mechanism to Coordinate Flexibility in Residential Energy Cooperatives

Author: Chakraborty Shantanu
Hernandez-Leal Pablo
Kaisers Michael
Publication date: 13/02/2019

Energy cooperatives (ECs) such as residential and industrial microgrids have the potential to mitigate increasing fluctuations in renewable electricity generation, but only if their joint response is coordinated. However, the coordination and control of independently operated flexible resources (e.g., storage, demand response) poses critical challenges arising from the heterogeneity of the resources, conflicts of interest, and impact on the grid. Overcoming these challenges with a general and fair yet efficient exchange mechanism that coordinates these distributed resources will accommodate renewable fluctuations at the local level, thereby supporting the energy transition. In this paper, we introduce such an exchange mechanism. It incorporates a payment structure that encourages prosumers to participate in the exchange by increasing their utility above baseline alternatives. The allocation from the proposed mechanism increases system efficiency (utilitarian social welfare) and distributes profits more fairly (measured by Nash social welfare) than individual flexibility activation. A case study analyzing the mechanism's performance and resulting payments in numerical experiments over real demand and generation profiles from the Pecan Street dataset demonstrates its efficacy in promoting cooperation between co-located flexibilities in residential cooperatives through local exchange.

Comment: Accepted in IEEE ICIT 201

arXiv.org e-Print Archive

CWI's Institutional Repository
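
The abstract above compares allocations by utilitarian social welfare (efficiency) and Nash social welfare (fairness). As a minimal sketch, assuming the standard textbook definitions (sum and product of individual utilities, respectively), the two measures can be computed as follows. The utility values and variable names are hypothetical illustrations, not figures from the paper, and the paper may define utilities relative to a baseline.

```python
from math import prod

def utilitarian_welfare(utilities):
    """Sum of individual utilities: a measure of overall efficiency."""
    return sum(utilities)

def nash_welfare(utilities):
    """Product of individual utilities: favours balanced allocations."""
    return prod(utilities)

# Hypothetical utilities for three prosumers (illustrative values only).
individual = [1.0, 1.0, 7.0]  # each prosumer activates flexibility alone
exchange = [3.0, 3.0, 4.0]    # utilities after a coordinated exchange

print(utilitarian_welfare(individual), nash_welfare(individual))  # 9.0 7.0
print(utilitarian_welfare(exchange), nash_welfare(exchange))      # 10.0 36.0
```

In this toy case the coordinated allocation is slightly more efficient (10 versus 9) and far more balanced (36 versus 7), which is the kind of comparison the two measures are meant to capture.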

Learning against sequential opponents in repeated stochastic games

Author: Hernandez-Leal P. (Pablo)
Kaisers M. (Michael)
Publication date: 01/07/2017

CWI's Institutional Repository

Agent Modeling as Auxiliary Task for Deep Reinforcement Learning

Author: Hernandez-Leal Pablo
Kartal Bilal
Taylor Matthew E.
Publication date: 22/07/2019

In this paper we explore how actor-critic methods in deep reinforcement learning, in particular Asynchronous Advantage Actor-Critic (A3C), can be extended with agent modeling. Inspired by recent work on representation learning and multiagent deep reinforcement learning, we propose two architectures to perform agent modeling: the first based on parameter sharing, and the second based on agent policy features. Both architectures aim to learn other agents' policies as auxiliary tasks, in addition to the standard actor (policy) and critic (values). We performed experiments in both cooperative and competitive domains. The former is a problem of coordinated multiagent object transportation and the latter is a two-player mini version of the Pommerman game. Our results show that the proposed architectures stabilize learning and outperform the standard A3C architecture when learning a best response in terms of expected rewards.

Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19)

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Action Guidance with MCTS for Deep Reinforcement Learning

Author: Hernandez-Leal Pablo
Kartal Bilal
Taylor Matthew E.
Publication date: 25/07/2019

Deep reinforcement learning has achieved great successes in recent years; however, one main challenge is sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. We propose a new framework where even a non-expert simulated demonstrator, e.g., planning algorithms such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.

Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with arXiv:1904.05759, arXiv:1812.0004

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Terminal Prediction as an Auxiliary Task for Deep Reinforcement Learning

Author: Hernandez-Leal Pablo
Kartal Bilal
Taylor Matthew E.
Publication date: 24/07/2019

Deep reinforcement learning has achieved great successes in recent years, but there are still open challenges, such as convergence to locally optimal policies and sample inefficiency. In this paper, we contribute a novel self-supervised auxiliary task, i.e., Terminal Prediction (TP), estimating temporal closeness to terminal states for episodic tasks. The intuition is to help representation learning by letting the agent predict how close it is to a terminal state while learning its control policy. Although TP could be integrated with multiple algorithms, this paper focuses on Asynchronous Advantage Actor-Critic (A3C) and demonstrates the advantages of A3C-TP. Our extensive evaluation includes a set of Atari games, the BipedalWalker domain, and a mini version of the recently proposed multi-agent Pommerman game. Our results on Atari games and the BipedalWalker domain suggest that A3C-TP outperforms standard A3C in most of the tested domains and performs similarly in the others. In Pommerman, our proposed method provides significant improvement both in learning efficiency and in converging to better policies against different opponents.

Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: text overlap with arXiv:1812.0004

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
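
The Terminal Prediction abstract above describes an auxiliary task in which the agent predicts how close it is to a terminal state. As a rough sketch only, one plausible way to build such self-supervised targets is to label step t of an N-step episode with the fraction t / N and penalize the prediction with a mean squared error. Both the target definition and the function names `tp_targets` and `tp_loss` are assumptions made for illustration; the abstract does not specify them.

```python
def tp_targets(episode_length):
    """Assumed target: fraction of the episode elapsed at each step (t / N)."""
    return [t / episode_length for t in range(1, episode_length + 1)]

def tp_loss(predictions, targets):
    """Mean squared error between predicted and assumed closeness values."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)

# Targets become available once an episode of 4 steps has finished.
targets = tp_targets(4)
print(targets)                              # [0.25, 0.5, 0.75, 1.0]
print(tp_loss([0.0, 0.0, 0.0, 0.0], targets))  # 0.46875

# In an actor-critic setup this term would typically be added to the main
# loss with a weighting coefficient (hypothetical): total = a3c_loss + w * tp
```

Because the labels come from the episode itself, no extra supervision is needed, which matches the self-supervised framing in the abstract.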